Value at Risk (VaR)

import sys
from pathlib import Path
sys.path.append(str(Path.cwd().parent))
import numpy as np
import pandas as pd
import datetime as dt
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats
from const.numbers import *
import const.literals as c
from utils.logger import get_logger
from utils.fin_utils import calculate_parametric_var, age_weight_series


logger = get_logger(__name__)

Value at Risk (VaR) is a statistical measure of the potential financial loss in a portfolio over a specific time period at a given confidence level. It answers the question: “What is the maximum loss we can expect with X% confidence over Y time period?”

For example, a one-day 95% VaR of $1 million means there is a 95% probability that the portfolio will not lose more than $1 million in one day.

Normal Distribution Assumption

While financial returns rarely follow a perfect normal distribution, they are often approximated as a linear combination of a standard normal distribution for practical purposes. This approximation is expressed as:

\[ r(t) = \mu + \sigma \cdot N(0, 1) \]

where:

- \(r(t)\) is the return at time \(t\)
- \(\mu\) is the expected return (mean)
- \(\sigma\) is the volatility (standard deviation)
- \(N(0,1)\) is a standard normal distribution with mean 0 and variance 1

This assumption simplifies VaR calculations but should be used with awareness of its limitations, particularly during market stress when returns tend to exhibit fat tails.
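As a quick sanity check (using made-up values for \(\mu\) and \(\sigma\), not estimated from any real data), we can draw standard normal samples and apply this linear transformation; the sample mean and standard deviation should recover the chosen parameters:

# Sketch: simulate returns as a linear transform of N(0, 1) draws
rng = np.random.default_rng(0)
mu, sigma = 0.0005, 0.01                        # assumed daily mean and volatility
r = mu + sigma * rng.standard_normal(10_000)    # r(t) = mu + sigma * N(0, 1)
logger.info("Sample mean: %.5f, sample std: %.5f", r.mean(), r.std())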

Interpreting VaR

Using this statistical framework, we can express potential losses in actual monetary terms, making risk more tangible and actionable for portfolio managers and stakeholders. For example:

“With a 95% confidence level (or 5% significance level), we estimate that your portfolio’s maximum loss will not exceed $23,000 over the next week.”

This interpretation is more useful than abstract statistical measures because it:

  1. Provides a concrete dollar value
  2. Specifies a clear time horizon
  3. Quantifies the confidence level
  4. Is easily communicated to non-technical stakeholders

\[ \text{Prob}(\delta \pi \leq -\text{VaR}) = 1 - \text{Confidence Level} \] where \(\delta \pi\) denotes the change in portfolio value over the horizon.

Here is how daily statistics scale to an \(n\)-day horizon: \[ \sigma_{\text{n-day}} = \sigma_{\text{daily}}\sqrt{n}\\ \mu_{\text{n-day}} = \mu_{\text{daily}}\,n \]
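As a quick illustration (with made-up daily figures), we can scale daily statistics to a 5-day horizon and confirm empirically that the weekly return falls below its \((1-C)\) quantile with probability \(1-C\):

# Sketch: scale daily statistics to a 5-day horizon (illustrative values)
mu_daily, sigma_daily, n = 0.0005, 0.01, 5
mu_week = mu_daily * n                  # mean scales linearly with time
sigma_week = sigma_daily * np.sqrt(n)   # volatility scales with sqrt(time)

# The (1 - C) quantile of the weekly return distribution should be breached
# on the downside with probability 1 - C
confidence = 0.95
q = stats.norm.ppf(1 - confidence, loc=mu_week, scale=sigma_week)
sims = np.random.default_rng(1).normal(mu_week, sigma_week, 100_000)
logger.info("Empirical P(weekly return <= q): %.3f", (sims <= q).mean())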

Parametric VaR

The parametric \(\text{VaR}\) formula is \[ \text{VaR} = \left|\,\underbrace{\mu\, \delta t}_{\text{from } \mu_{\text{daily}}n} - \underbrace{\sigma \sqrt{\delta t}}_{\text{from }\sigma_{\text{daily}}\sqrt{n}}\, F^{-1}(1-C)\,\right| \] where \(F^{-1}\) is the inverse cumulative distribution function (most often the standard normal), evaluated at the significance level \(1-C\).

As an example, suppose our portfolio has an annual mean return of 0.14 and annual volatility of 0.32, and we want the 99% 1-week VaR. With \(\delta t = 1/52\) years:

\[ \text{VaR}_{1w} = \left|\, \tfrac{0.14}{52} - \tfrac{0.32}{\sqrt{52}}\, F^{-1}(0.01)\,\right| \approx \left|\,0.0027 + 0.0444 \times 2.33\,\right| \approx 10.6\% \]

So, with 99% confidence, this portfolio should not lose more than roughly 10.6% of its value over a one-week horizon; equivalently, there is a 1% chance of losing at least that much.
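The calculate_parametric_var helper imported above wraps this calculation; its actual implementation lives in utils.fin_utils and may differ in detail, but a minimal sketch consistent with the formula above (the function name and signature here are illustrative, not the library's) could look like this:

def parametric_var_sketch(value, confidence, mu, sigma, horizon=1.0):
    """Sketch: VaR = |mu * dt - sigma * sqrt(dt) * F^-1(1 - C)| * portfolio value."""
    z = stats.norm.ppf(1 - confidence)   # negative quantile, e.g. about -2.33 at 99%
    return value * abs(mu * horizon - sigma * np.sqrt(horizon) * z)

# Reproduce the worked example: weekly mu = 0.14/52, weekly sigma = 0.32/sqrt(52), C = 99%
logger.info("1-week 99%% VaR: %.1f%% of portfolio value",
            100 * parametric_var_sketch(1.0, 0.99, 0.14 / 52, 0.32 / np.sqrt(52)))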

Imagine a long-only portfolio holding three stocks; below we simulate their daily returns and compute the portfolio return series.

# Generate synthetic daily returns for three stocks
np.random.seed(42)  # For reproducibility
n_days = 1000  # About 4 years of trading days

# Create date range
dates = pd.date_range(
    start='2020-01-01',
    end='2023-12-31',
    freq='B'  # Business days
)[:n_days]

# Generate correlated returns for three stocks
# Define parameters for each stock (mean, std)
stock_params = {
    'AAPL': (0.0012, 0.02),  # Higher return, moderate risk
    'LULU': (0.0010, 0.025), # Moderate return, higher risk
    'C': (0.0008, 0.018)     # Lower return, lower risk
}

# Create correlation matrix (realistic correlations between stocks)
correlation_matrix = np.array([
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0]
])

# Generate correlated normal random variables
L = np.linalg.cholesky(correlation_matrix)
uncorrelated_returns = np.random.normal(size=(n_days, 3))
correlated_returns = uncorrelated_returns @ L.T

# Scale returns according to desired parameters and add mean
returns_data = {}
for i, (stock, (mu, sigma)) in enumerate(stock_params.items()):
    returns_data[stock] = correlated_returns[:, i] * sigma + mu

# Create DataFrame with the returns
portfolio_returns = pd.DataFrame(returns_data, index=dates)

# Define portfolio weights (50% AAPL, 30% LULU, 20% C)
weights = np.array([0.5, 0.3, 0.2])

# Calculate portfolio returns
portfolio_returns[c.PORTFOLIO_RETURN] = portfolio_returns.dot(weights)

# Calculate basic statistics
mean_return = portfolio_returns[c.PORTFOLIO_RETURN].mean()
std_dev = portfolio_returns[c.PORTFOLIO_RETURN].std()
skewness = portfolio_returns[c.PORTFOLIO_RETURN].skew()
kurtosis = portfolio_returns[c.PORTFOLIO_RETURN].kurtosis()  # excess kurtosis (0 for a normal distribution)

logger.info("Portfolio Statistics:")
logger.info(f"Mean Daily Return: {mean_return:.4%}")
logger.info(f"Daily Volatility: {std_dev:.4%}")
logger.info(f"Skewness: {skewness:.4f}")
logger.info(f"Kurtosis: {kurtosis:.4f}")
logger.info(f"Annualized Return: {(1 + mean_return)**252 - 1:.4%}")
logger.info(f"Annualized Volatility: {std_dev * np.sqrt(252):.4%}")
2025-06-22 18:56:32,750 - __main__ - INFO - Portfolio Statistics:
2025-06-22 18:56:32,750 - __main__ - INFO - Mean Daily Return: 0.2008%
2025-06-22 18:56:32,751 - __main__ - INFO - Daily Volatility: 1.6309%
2025-06-22 18:56:32,752 - __main__ - INFO - Skewness: 0.0703
2025-06-22 18:56:32,752 - __main__ - INFO - Kurtosis: -0.1163
2025-06-22 18:56:32,754 - __main__ - INFO - Annualized Return: 65.7861%
2025-06-22 18:56:32,754 - __main__ - INFO - Annualized Volatility: 25.8901%
# Create a figure with two subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))

# Plot 1: Time series of returns
portfolio_returns[c.PORTFOLIO_RETURN].plot(ax=ax1)
ax1.set_title('Portfolio Returns Over Time')
ax1.set_xlabel('Date')
ax1.set_ylabel('Return')
ax1.grid(True)

# Plot 2: Distribution of returns with normal distribution overlay
returns = portfolio_returns[c.PORTFOLIO_RETURN].values
ax2.hist(returns, bins=50, density=True, alpha=0.7, color='skyblue')
ax2.set_title('Distribution of Portfolio Returns')
ax2.set_xlabel('Return')
ax2.set_ylabel('Density')
ax2.grid(True)

# Add normal distribution overlay
x = np.linspace(returns.min(), returns.max(), 100)
y = stats.norm.pdf(x, returns.mean(), returns.std())
ax2.plot(x, y, 'r-', lw=2, label='Normal Distribution')
ax2.legend()

plt.tight_layout()
plt.show()

# Additional plot: QQ plot to check normality
fig, ax = plt.subplots(figsize=(8, 8))
stats.probplot(returns, dist="norm", plot=ax)
ax.set_title("Q-Q Plot of Portfolio Returns")
plt.grid(True)
plt.show()

para_95_1d_var = calculate_parametric_var(
    value=MILLION,
    confidence=PERCENTILE_95,
    mu=portfolio_returns[c.PORTFOLIO_RETURN].mean(),
    sigma=portfolio_returns[c.PORTFOLIO_RETURN].std(),
)
logger.info("portfolio of %s value, 1D 95%% VaR: %s", MILLION, para_95_1d_var)
2025-06-22 18:56:33,576 - __main__ - INFO - portfolio of 1000000.0 value, 1D 95% VaR: 28834.374298083403

Age-Weighted Historical Simulation

We can use exponentially decaying (age-based) weights to reduce the ghost effect in historical-simulation VaR: the phenomenon where the VaR estimate jumps suddenly when a large historical loss enters or drops out of the lookback window, even though recent volatility has not changed.

# Exponentially decaying age weights (decay_param = 0.99), aligned to the return dates
age_weights = age_weight_series(num_period=len(portfolio_returns), decay_param=0.99)
age_weights.index = portfolio_returns.index.sort_values(ascending=False)
# Scale each day's portfolio return by its age weight and plot the result
aged_weighted_return = portfolio_returns[c.PORTFOLIO_RETURN].dropna() * age_weights
aged_weighted_return.plot()
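The age_weight_series helper comes from utils.fin_utils; a common choice for such weights (assumed here, the actual helper may differ) is the exponential scheme of Boudoukh, Richardson and Whitelaw, where the \(i\)-th most recent observation receives weight \(\lambda^{i}(1-\lambda)/(1-\lambda^{n})\). One way to turn these weights into a VaR estimate is to sort the returns from worst to best and accumulate weights until the tail probability \(1-C\) is reached. A sketch under those assumptions:

# Sketch: age-weighted historical VaR via the weighted empirical distribution
# (assumes BRW-style exponential weights; the actual age_weight_series helper may differ)
def age_weighted_hist_var_sketch(returns, decay=0.99, confidence=0.95):
    n = len(returns)
    # Weight of the i-th most recent observation: decay^i * (1 - decay) / (1 - decay^n)
    ages = np.arange(n)[::-1]               # oldest observation has the largest age
    weights = decay ** ages * (1 - decay) / (1 - decay ** n)
    order = np.argsort(returns.values)      # sort returns from worst to best
    cum_w = np.cumsum(weights[order])
    # VaR is the loss at the point where cumulative weight reaches 1 - confidence
    idx = np.searchsorted(cum_w, 1 - confidence)
    return -returns.values[order][idx]

hist_var_95_aw = age_weighted_hist_var_sketch(portfolio_returns[c.PORTFOLIO_RETURN].dropna())
logger.info("95%% 1-day age-weighted historical VaR (fraction of value): %.4f", hist_var_95_aw)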

Monte Carlo Simulation for VaR

Monte Carlo simulation is another powerful approach for calculating VaR. It involves:

  1. Generating many random scenarios based on the portfolio's statistical properties
  2. Simulating potential future values of the portfolio
  3. Calculating the loss distribution
  4. Finding the VaR at the desired confidence level

This method is particularly useful when:

    • The portfolio contains complex instruments (e.g., options)
    • Returns are not normally distributed
    • There are non-linear relationships between risk factors

# Perform Monte Carlo simulation for VaR calculation
np.random.seed(42)  # For reproducibility
n_simulations = 10000
simulation_horizon = 1  # 1 day

# Get portfolio parameters
mu = portfolio_returns[c.PORTFOLIO_RETURN].mean()
sigma = portfolio_returns[c.PORTFOLIO_RETURN].std()

# Generate random scenarios
simulated_returns = np.random.normal(
    loc=mu,
    scale=sigma,
    size=n_simulations
)

# Calculate portfolio values
initial_value = MILLION  # $1 million portfolio
simulated_values = initial_value * (1 + simulated_returns)
simulated_pnl = simulated_values - initial_value

# Calculate VaR at different confidence levels
confidence_levels = [0.90, 0.95, 0.99]
var_levels = {}

for conf in confidence_levels:
    var = -np.percentile(simulated_pnl, (1 - conf) * 100)
    var_levels[conf] = var
    logger.info(f"{conf*100}% 1-day Monte Carlo VaR: ${var:,.2f}")

# Plot the distribution of simulated P&L
plt.figure(figsize=(10, 6))
sns.histplot(simulated_pnl, bins=50, kde=True)
plt.title('Distribution of Simulated Portfolio P&L')
plt.xlabel('Profit/Loss ($)')
plt.ylabel('Frequency')

# Add vertical lines for VaR levels
colors = ['g', 'y', 'r']
for (conf, var), color in zip(var_levels.items(), colors):
    plt.axvline(-var, color=color, linestyle='--', 
                label=f'{conf*100}% VaR: ${var:,.0f}')

plt.legend()
plt.show()
2025-06-22 18:56:33,877 - __main__ - INFO - 90.0% 1-day Monte Carlo VaR: $19,068.32
2025-06-22 18:56:33,878 - __main__ - INFO - 95.0% 1-day Monte Carlo VaR: $24,981.39
2025-06-22 18:56:33,879 - __main__ - INFO - 99.0% 1-day Monte Carlo VaR: $35,838.19

Comparison of VaR Methods

We have explored three main approaches to calculating VaR:

  1. Parametric VaR
    • Assumes normal distribution of returns
    • Fast and simple to compute
    • May underestimate risk during market stress
    • Best for linear portfolios with normally distributed returns
  2. Historical Simulation
    • Uses actual historical data
    • No distribution assumptions
    • Limited by available historical data
    • May not capture regime changes
    • Can be improved with age-weighting (as demonstrated above)
  3. Monte Carlo Simulation
    • Flexible and can handle complex portfolios
    • Can incorporate various distribution assumptions
    • Computationally intensive
    • Results depend on quality of simulation model

Best practices suggest using multiple VaR methods and comparing their results to get a more complete picture of portfolio risk. Each method has its strengths and limitations, and using them in combination provides better risk insights.
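To make that comparison concrete, we can compute a plain (unweighted) historical-simulation VaR directly from the empirical return distribution and put it next to the parametric and Monte Carlo figures obtained above. A small sketch reusing the variables already defined in this notebook:

# Sketch: compare 95% 1-day VaR estimates from the three approaches
hist_var_95 = -np.percentile(portfolio_returns[c.PORTFOLIO_RETURN].dropna(), 5) * MILLION
logger.info("95%% 1-day VaR on a %s portfolio:", MILLION)
logger.info("  Parametric:            %.2f", para_95_1d_var)
logger.info("  Historical simulation: %.2f", hist_var_95)
logger.info("  Monte Carlo:           %.2f", var_levels[0.95])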

For reference, the 95% quantile of the age-weighted return series computed earlier:

confidence_level = 0.95
np.percentile(aged_weighted_return.dropna(), (1 - confidence_level) * 100)
-3.612639515008725e-05